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Amendments to tine Claims: 

This listing of claims will replace all prior versions, and listings, of claims in 
the application: 

Listing of Claims: 

1. (Previously presented) A method, comprising: 
identifying a loop in a program; 

identifying each vector memory reference in the loop; 

detennining dependencies between vector memory references In the loop, 
including determining unidirectional and circular dependencies; 

distributing the vector memory references into a plurality of detail loops 
configured to allocate the vector memory references into a plurality 
of temporary arrays, sized and located, so that none of the vector 
memory references are cache synonyms, wherein the vector 
memory references that have circular dependencies therebetween 
are included in a common detail loop, and wherein the detail loops 
are ordered according to the unidirectional dependencies between 
the memory references; 

analyzing an execution profile of the program after said distributing; and 

based on the execution profile, determining whether to repeat said 
identifying a loop, said identifying each vector memory reference, 
said determining dependencies, and said distributing. 

2. (Original) A method, as set forth in claim 1, further comprising allocating a 
plurality of temporary storage areas within a cache and detemnining the size of 
each temporary storage area based on the size of the cache and the number of 
temporary storage areas. 

3. (Original) A method, as set forth in claim 1, further comprising at least one 
section loop including the plurality of detail loops. 
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4. (Original) A method, as set forth in claim 1. wherein distributing the vector 
memory references into a plurality of detail loops further comprises distributing 
the vector memory references into a plurality of detail loops that each contain at 
least one vector memory reference that could benefit from cache management. 

5. (Original) A method, as set forth in claim 1, further comprising Inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 

6. (Original) A method, as set forth in claim 1, further comprising inserting 
prefetch instructions into at least one of said detail loops to control movement of 
data associated with the vector memory reference between a cache and main 
memory. 

7. (Original) A method, as set forth in claim 1, further comprising performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

8. (Original) A method, as set forth in claim 1 , further comprising Inserting at 
least one of a prefetch instruction and a cache management instruction into at 
least one of said detail loops to control movement of data associated with the 
vector memory reference between a cache and main memory, and performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

9. (Previously presented) A method, comprising: 
identifying a loop in a program; 

identifying each vector memory reference in the loop; 
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determining dependencies between vector memory references in the 
loop; and 

distributing tlie vector memory references into a plurality of detail loops 
that serially proceed through strips of the vector memory references 
and store the strips in temporary arrays so that none of the vector 
memory references are cache synonyms, wherein the vector 
memory references that have dependencies therebetween are 
included in a common detail loop; 

wherein said distributing the vector memory references into a plurality of 
detail loops is perfomned by a first computer for execution by a 
second computer. 

10. (Original) A method, as set forth in claim 9, further comprising allocating a 
plurality of temporary storage areas within a cache and detemnining the size of 
each temporary storage area based on the size of the cache and the number of 
temporary storage areas. 

1 1 . (Original) A method, as set forth in claim 9, further comprising at least one 
section loop including the plurality of detail loops. 

12. (Original) A method, as set forth in claim 9, wherein distributing the vector 
memory references into a plurality of detail loops further comprises distributing 
the vector memory references into a plurality of detail loops that each contain at 
least one vector memory reference that could benefit from cache management. 

13. (Original) A method, as set forth in claim 9, further comprising inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 
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14. (Original) A method, as set forth in claim 9, further comprising inserting 
prefetch instructions into at least one of said detail loops to control movement of 
data associated with the vector memory reference between a cache and main 
memory. 

15. (Original) A method, as set forth in claim 9, further comprising performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

16. (Original) A method, as set forth in claim 9, further comprising inserting at 
least one of a prefetch instruction and a cache management instruction into at 
least one of said detail loops to control movement of data associated with the 
vector memory reference between a cache and main memory, and performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

17. (Previously presented) A method, comprising: 
identifying a loop in a program; 

identifying each vector memory reference in the loop; 

determining dependencies between vector memory references in the loop; 

distributing the vector memory references into a plurality of detail loops in 
response to cache behavior and the dependencies between the 
vector memory references in the loop, wherein the detail loops 
cause storage of the vector memory references in temporary arrays 
that are allocated consecutively so that no temporary anrays 
elements are cache synonyms, 

wherein said identifying a loop, said identifying each vector memory 
reference, said detemnining dependencies between vector memory 
references and said distributing the vector memory references into 
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a plurality of detail loops produce code that is substantially 
Independent of a computer architecture; and 
performing code optimizations that are dependent on a computer 
architecture after said distributing. 

18. (Original) A method, as set forth in claim 17. wherein distributing the 
vector memory references further comprises distributing the vector memory 
references into the plurality of detail loops with each loop having at least one of 
the identified vector memory references. 

19. (Original) A method, as set forth in claim 17, further comprising 
determining dependencies between vector memory references In the loop, and 
wherein distributing the loop includes distributing the vector memory references 
into the plurality of detail loops, wherein the vector memory references that have 
circular dependencies therebetween are included in a common detail loop. 

20. (Original) A method, as set forth in claim 17, further comprising inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory, 

21. (Original) A method, as set forth in claim 17. further comprising inserting 
prefetch instructions into at least one of said detail loops to control movement of 
data associated with the vector memory reference between a cache and main 
memory, 

22. (Original) A method, as set forth in claim 17, further comprising performing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

183617.01/1662.48400 Page 6 Of 1 2 HP PONO20030173M 



PACE 8/14 • RCVD AT 11/17/2005 5:39:13 PM [Eastern Standard Time] • SVR:USPTO-EFXRF-6/27 " DNIS: 2738300 * CSID: 71 32388008 * DURATION (mm-ss): 04-24 



11/17/2005 16:36 FAX 7132388008 



121009/014 



Appl. No. 09/785,143 

Amdt. dated Novdmber 17, 2005 

Reply to final Office action of July 27, 2005 

23. (Original) A method, as set forth in claim 17. further comprising inserting 
at least one of a prefetch instruction and a cache management instruction into at 
least one of said detail loops to control movement of data associated with the 
vector memory reference between a cache and main memory, and perfomiing 
loop unrolling on at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory. 

24. (Previously presented) A computer programmed to perform a method, 
comprising: 

identifying a loop in a program; 
identifying each vector memory reference in the loop; 
determining dependencies between vector memory references in the 
loop; and 

distributing the vector memory references into a plurality of detail loops 
configured to retrieve strips of the vector memory references and 
store the strips in temporary arrays, wherein the vector memory 
references that have circular dependencies therebetween are 
included in a common detail loop. 

wherein the temporary arrays are configured to simultaneously fit in a 
single cache bank. 

25. -32. (Canceled). 

33. (Previously presented) A method for reducing the likelihood of cache 
thrashing by software to be executed on a computer system having a cache, 
comprising: 

executing the software on the computer system; 

generating a profile indicating the manner in which the software uses the 
cache; 
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identifying a portion of the software that exhibits cache thrashing based on 
the profile data; and 

modifying the identified portion of the software to reduce the likelihood of 
cache thrashing by distributing cache synonyms into detail loops 
configured to allocate the cache synonyms into temporary storage 
areas, sized and located, to prevent cache thrashing, 

wherein said modifying occurs before optimizations that are based on an 
architecture of the computer system. 

34. (Previously presented) A method, as set forth in claim 33, wherein 
modifying the identified portion of the software to reduce the likelihood of cache 
thrashing further comprises: 

identifying a loop in the identified portion of the software; 

identifying each vector memory reference in the identified loop; 

determining dependencies between the vector memory references in the 
identified loop of the software, including detemiining unidirectional 
and circular dependencies; and 

reducing cache thrashing by distributing the vector memory references into 
a plurality of detail loops, wherein the vector memory references 
that have circular dependencies therebetween are included in a 
common detail loop, and wherein the detail loops are ordered 
according to the unidirectional dependencies between the memory 
references. 

35. (Previously presented) A method for reducing the likelihood of cache 
thrashing by software to be executed on a computer system having a cache, 
comprising: 

executing the software on the computer system; 

generating a profile indicating the manner in which the memory references 
of the software use the cache; 
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identifying a portion of the mennory references based on the profile, 

wherein the portion of the menrtory references is determined to 

cause cache thrashing; and 
reducing cache thrashing by distributing the portion of the memory 

references into distinct loops that allocate strips of the memory 

references into temporary arrays for execution, 
wherein the temporary arrays are configured to simultaneously fit in a 

single cache bank. 



36. (Previously presented) The computer of claim 24 wherein the temporary 
arrays are allocated consecutively such that no temporary array elements are 
cache synonyms. 

37. (Previously presented) The computer of claim 24 wherein the details loops 
are allocated into section loops that cause iterative execution of the detail loops 
based on a size of the strips. 



38. (Canceled). 

39. (Previously presented) The method of claim 35 wherein the temporary 
arrays are located and sized to reduce cache thrashing. 

40. (Previously presented) The method of claim 35 wherein the temporary 
an-ays are allocated consecutively and iteratively executed by a size of the strips. 



41 . (New) A compiler, comprising: 

means for identifying a loop in a program; 

means for identifying each vector memory reference in the loop; 

means for determining dependencies between vector memory references 

in the loop, including determining unidirectional and circular 

dependencies; 
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means for distributing the vector memory references into a plurality of 
detail loops configured to serially process strips of the vector 
memory references so that thrashing does not occur, wherein the 
vector memory references that have circular dependencies 
therebetween are included in a common detail loop, and wherein 
the detail loops are ordered according to the unidirectional 
dependencies between the memory references; 

means for determining an execution profile of the program after said 
distributing occurs; and 

means for selectively repeating use of said means for identifying a loop, 
said means for identifying each vector memory reference, said 
means for determining dependencies, said means for distributing 
the vector memory references into a plurality of detail loops, and 
said means for detennining an execution profile based on said 
execution profile. 

42. (New) A compiler, as set forth in claim 41, further comprising means for 
allocating a plurality of temporary storage areas within a cache and determining 
the size of each temporary storage area based on the size of the cache and the 
number of temporary storage areas. 

43. (New) A compiler, as set forth in claim 41, wherein the means for 
distributing the vector memory references into a plurality of detail loops further 
comprises distributing the vector memory references into a plurality of detail loops 
that each contain at least one vector memory reference that could benefit from 
cache management. 

44. (New) A compiler, as set forth in claim 41, further comprising inserting 
cache management instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory, 
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45. (New) A compiler, as set forth in claim 41, further comprising means for 
inserting prefetch instructions into at least one of said detail loops to control 
movement of data associated with the vector memory reference between a cache 
and main memory. 

46. (New) A compiler, as set forth in claim 41, further comprising means for 
performing loop unrolling on at least one of said detail loops to control movement 
of data associated with the vector memory reference between a cache and main 
memory. 

47. (New) A compiler, as set forth in claim 41 , further comprising means for 
inserting at least one of a prefetch instruction and a cache management 
instruction into at least one of said detail loops to control movement of data 
associated with the vector memory reference between a cache and main 
memory, and performing loop unrolling on at least one of said detail loops to 
control movement of data associated with the vector memory reference t^etween 
a cache and main memory. 



163617.01/166Z48400 Page 1 1 Of 1 2 HPPDNO200301732.1 

PAGE ia/14 * RCVD AT 11/17/2005 5:39:13 PM [Eastern Standard Time] • SVR:USPTO-EFXRF-6/27 * DN1S:2738300 • CSID:71 32388008 " DURATION (mm-ss):04-24 



